Estimating lower vocal tract features with closed-open phase spectral analyses
Abstract
Previous studies have shown that lower vocal tract acoustics are speaker-dependent yet context-independent, and that they significantly shape the speech spectrum at mid-to-high frequencies (e.g., 3-6 kHz). The present work automatically estimates spectral features that exhibit acoustic properties of the lower vocal tract. To capture the cyclicity property of the epilarynx tube, a novel multi-resolution approach to spectral analysis is presented that exploits significant differences between the closed and open phases of a glottal cycle. A prominent null linked to the piriform fossa is also estimated. Examples of the feature estimation on natural speech from the VOICES multi-speaker corpus illustrate that a salient spectral pattern indeed emerges between 3 and 6 kHz across all speakers. Moreover, the observed pattern is consistent with that canonically shown for the lower vocal tract in previous works. Additionally, an instance of a speaker's formant (i.e., a spectral peak around 3 kHz that is well established as a characteristic of voice projection) is quantified here for the VOICES template speaker in relation to epilarynx acoustics. On average, the corresponding peak is twice as high in dB for this speaker as for the other speakers (20 vs 10 dB).
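The closed-versus-open phase comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it uses a single analysis resolution rather than the multi-resolution approach, and it assumes that glottal closure and opening instants are already available as sample indices (for example from an electroglottograph or a separate detector). The function names `band_level_db` and `closed_open_contrast`, the 3-6 kHz band limits as defaults, and the FFT size are all illustrative assumptions.

```python
import numpy as np

def band_level_db(segment, fs, f_lo=3000.0, f_hi=6000.0, n_fft=1024):
    """Mean power in dB of one Hann-windowed segment inside [f_lo, f_hi]."""
    window = np.hanning(len(segment))
    n = max(n_fft, len(segment))  # zero-pad short segments, never truncate
    spectrum = np.fft.rfft(segment * window, n=n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    # Normalise by window energy so segments of different length are comparable.
    power = np.mean(np.abs(spectrum[band]) ** 2) / np.sum(window ** 2)
    return 10.0 * np.log10(power + 1e-12)

def closed_open_contrast(signal, fs, closures, openings):
    """Average closed-minus-open band level (dB) over glottal cycles.

    Assumes closures[i] < openings[i] < closures[i + 1] (sample indices),
    which are taken as given here, not estimated.
    """
    diffs = []
    for i in range(len(closures) - 1):
        closed = signal[closures[i]:openings[i]]
        opened = signal[openings[i]:closures[i + 1]]
        if len(closed) < 8 or len(opened) < 8:
            continue  # skip phases too short for a meaningful spectrum
        diffs.append(band_level_db(closed, fs) - band_level_db(opened, fs))
    if not diffs:
        raise ValueError("no usable glottal cycles")
    return float(np.mean(diffs))
```

A positive return value indicates more 3-6 kHz energy during the closed phase than during the open phase; how such a contrast relates to the epilarynx features estimated in the paper is left to the full text.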
Related articles
Modeling of the glottal flow derivative waveform with application to speaker identification
Speech production has long been viewed as a linear filtering process, as described by Fant in the late 1950's [10]. The vocal tract, which acts as the filter, is the primary focus of most speech work. This thesis develops a method for estimating the source of speech, the glottal flow derivative. Models are proposed for the coarse and fine structure of the glottal flow derivative, accounting for...
On the Use of a Spectral Glottal Model for the Source-filter Separation of Speech
The estimation of glottal flow from a speech waveform is a key method for speech analysis and parameterization. Significant research effort has been made to dissociate the first vocal tract resonance from the glottal formant (the low-frequency resonance describing the open phase of the vocal fold vibration). However, few methods cope with estimation of high-frequency spectral tilt to describe th...
Glottal-based analysis of the Lombard effect
The Lombard effect refers to the speech changes due to the immersion of the speaker in a noisy environment. Among these changes, studies have already reported acoustic modifications mainly related to the vocal tract behaviour. In a complementary way, this paper investigates the variation of the glottal flow in Lombard speech. For this, the glottal flow is estimated by a closed-phase analysis an...
DC-constrained linear prediction for glottal inverse filtering
Closed phase covariance (CP) analysis is a glottal inverse filtering method which estimates the vocal tract during the glottal closed phase. Since closed phase durations are typically short, the vocal tract computation with linear prediction is vulnerable to the covariance frame position. This study proposes a modified CP algorithm in which a DC-gain constraint is imposed in optimizing the line...
Source-filter separation of speech signal in the phase domain
Deconvolution of the speech excitation (source) and vocal tract (filter) components through log-magnitude spectral processing is well-established and has led to the well-known cepstral features used in a multitude of speech processing tasks. This paper presents a novel source-filter decomposition based on processing in the phase domain. We show that separation between source and filter in the l...